Overview

Dataset statistics

Number of variables: 12
Number of observations: 5679597
Missing cells: 490697
Missing cells (%): 0.7%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 520.0 MiB
Average record size in memory: 96.0 B
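The memory and missing-cell figures above can be cross-checked with simple arithmetic. The sketch below assumes each of the 12 columns costs 8 bytes per row (a 64-bit number or an object pointer). That width is an assumption, not something the report states, although it reproduces the 96.0 B record size.

```python
# Figures copied from the "Dataset statistics" block of the report.
N_ROWS = 5_679_597
N_COLS = 12
MISSING_CELLS = 490_697
BYTES_PER_VALUE = 8  # assumption: 64-bit value or pointer per cell


def to_mib(n_bytes: int) -> float:
    """Convert a byte count to mebibytes (1 MiB = 2**20 bytes)."""
    return n_bytes / 2**20


record_size = N_COLS * BYTES_PER_VALUE            # bytes per row
total_mib = to_mib(N_ROWS * record_size)          # whole table
column_mib = to_mib(N_ROWS * BYTES_PER_VALUE)     # one column
missing_pct = 100 * MISSING_CELLS / (N_ROWS * N_COLS)

print(f"record size: {record_size} B")
print(f"total:       {total_mib:.1f} MiB")
print(f"per column:  {column_mib:.1f} MiB")
print(f"missing:     {missing_pct:.1f}%")
```

Under that assumption the results agree with the reported 520.0 MiB total, the 43.3 MiB "Memory size" shown for each variable, and the 0.7% missing-cell rate.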

Variable types

Numeric: 7
Categorical: 5

Alerts

filename has a high cardinality: 1833 distinct values High cardinality
sha256 has a high cardinality: 874949 distinct values High cardinality
imp_hash has a high cardinality: 155717 distinct values High cardinality
sec_md5 has a high cardinality: 1654205 distinct values High cardinality
sec_name has a high cardinality: 37166 distinct values High cardinality
Unnamed: 0 is highly correlated with win_count High correlation
win_count is highly correlated with Unnamed: 0 High correlation
sec_chi2 is highly correlated with raw_size High correlation
sec_entropy is highly correlated with raw_size and 1 other field High correlation
raw_size is highly correlated with sec_chi2 and 2 other fields High correlation
virtual_size is highly correlated with sec_entropy and 1 other field High correlation
sec_entropy is highly correlated with raw_size High correlation
virtual_size is highly correlated with raw_size High correlation
raw_size is highly correlated with virtual_size and 1 other field High correlation
virtual_address is highly correlated with raw_size High correlation
imp_hash has 459864 (8.1%) missing values Missing
sec_chi2 is highly skewed (γ1 = 331.9046384) Skewed
raw_size is highly skewed (γ1 = 325.3469567) Skewed
virtual_size is highly skewed (γ1 = 104.1186351) Skewed
virtual_address is highly skewed (γ1 = 43.51514194) Skewed
Unnamed: 0 is uniformly distributed Uniform
Unnamed: 0 has unique values Unique
sec_entropy has 1058492 (18.6%) zeros Zeros
raw_size has 502261 (8.8%) zeros Zeros
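The "Skewed" alerts report a skewness coefficient γ1. A minimal sketch of how such a value can be computed follows, assuming the report uses the bias-adjusted sample skewness (the adjusted Fisher-Pearson coefficient, which is what pandas `Series.skew` returns by default). The function name is hypothetical.

```python
import math


def sample_skewness(values):
    """Adjusted Fisher-Pearson skewness (assumed to match the report's γ1)."""
    n = len(values)
    if n < 3:
        raise ValueError("skewness needs at least three observations")
    mean = sum(values) / n
    # Biased central moments of order 2 and 3.
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    if m2 == 0:
        return 0.0  # constant data: no asymmetry to measure
    g1 = m3 / m2 ** 1.5
    # Small-sample bias correction.
    return math.sqrt(n * (n - 1)) / (n - 2) * g1


print(sample_skewness([1, 2, 3, 4, 5]))    # symmetric data
print(sample_skewness([1, 1, 1, 1, 100]))  # one large outlier, right tail
```

Values in the hundreds, as reported for sec_chi2 and raw_size, indicate that a handful of extremely large observations dominate the distribution.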

Reproduction

Analysis started: 2022-08-01 04:24:28.489065
Analysis finished: 2022-08-01 04:27:05.604857
Duration: 2 minutes and 37.12 seconds
Software version: pandas-profiling v3.2.0
Download configuration: config.json
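The reported duration follows directly from the two timestamps. A small check using only the standard library, with the timestamps copied from the lines above:

```python
from datetime import datetime

# Timestamps as printed in the Reproduction block.
started = datetime.fromisoformat("2022-08-01 04:24:28.489065")
finished = datetime.fromisoformat("2022-08-01 04:27:05.604857")

elapsed = (finished - started).total_seconds()
minutes, seconds = divmod(elapsed, 60)
print(f"{int(minutes)} minutes and {seconds:.2f} seconds")
```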

Variables

Unnamed: 0
Real number (ℝ≥0)

HIGH CORRELATION
UNIFORM
UNIQUE

Distinct: 5679597
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2839798
Minimum: 0
Maximum: 5679596
Zeros: 1
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 43.3 MiB

Quantile statistics

Minimum: 0
5-th percentile: 283979.8
Q1: 1419899
Median: 2839798
Q3: 4259697
95-th percentile: 5395616.2
Maximum: 5679596
Range: 5679596
Interquartile range (IQR): 2839798

Descriptive statistics

Standard deviation: 1639558.573
Coefficient of variation (CV): 0.5773504217
Kurtosis: -1.2
Mean: 2839798
Median Absolute Deviation (MAD): 1419899
Skewness: -7.305472212 × 10^-17
Sum: 1.61289082 × 10^13
Variance: 2.688152314 × 10^12
Monotonicity: Strictly increasing
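Because this column is simply the row index 0, 1, ..., N-1, its statistics have closed forms. The sketch below recomputes several of them without loading any data. It assumes the standard deviation and variance use the sample convention (ddof=1) and that percentiles use linear interpolation; both are assumptions that happen to reproduce the printed values.

```python
import math

N = 5_679_597  # number of observations from the Overview

mean = (N - 1) / 2                        # midpoint of 0..N-1
total = N * (N - 1) // 2                  # sum of 0..N-1
variance = N * (N + 1) / 12               # sample variance (ddof=1) of 0..N-1
std = math.sqrt(variance)
cv = std / mean                           # tends to 1/sqrt(3) for large N
p05 = 0.05 * (N - 1)                      # linear-interpolation percentile
mad = (N - 1) / 4                         # exact here because N % 4 == 1

print(mean, std, cv, p05, mad)
```

The coefficient of variation close to 1/√3 ≈ 0.577 and the kurtosis of -1.2 are the signatures of a uniform distribution, which is why the report raises the UNIFORM and UNIQUE alerts. The column looks like an index written out with the data rather than a feature.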
Histogram with fixed size bins (bins=50)

Common values
Value  Count  Frequency (%)
0  1  < 0.1%
3786394  1  < 0.1%
3786402  1  < 0.1%
3786401  1  < 0.1%
3786400  1  < 0.1%
3786399  1  < 0.1%
3786398  1  < 0.1%
3786397  1  < 0.1%
3786396  1  < 0.1%
3786395  1  < 0.1%
Other values (5679587)  5679587  > 99.9%

Minimum 10 values
Value  Count  Frequency (%)
0  1  < 0.1%
1  1  < 0.1%
2  1  < 0.1%
3  1  < 0.1%
4  1  < 0.1%
5  1  < 0.1%
6  1  < 0.1%
7  1  < 0.1%
8  1  < 0.1%
9  1  < 0.1%

Maximum 10 values
Value  Count  Frequency (%)
5679596  1  < 0.1%
5679595  1  < 0.1%
5679594  1  < 0.1%
5679593  1  < 0.1%
5679592  1  < 0.1%
5679591  1  < 0.1%
5679590  1  < 0.1%
5679589  1  < 0.1%
5679588  1  < 0.1%
5679587  1  < 0.1%

filename
Categorical

HIGH CARDINALITY

Distinct: 1833
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 43.3 MiB

Value  Count
2022041901/2022041901_3  12228
2022042101/2022042101_3  11934
2022042101/2022042101_11  11928
2022042101/2022042101_10  11431
2022042101/2022042101_1  10756
Other values (1828)  5621320

Length

Max length: 33
Median length: 24
Mean length: 25.10842389
Min length: 23

Characters and Unicode

Total characters: 142605729
Distinct characters: 12
Distinct categories: 3
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 20220329/2022032900/2022032900_0
2nd row: 20220329/2022032900/2022032900_0
3rd row: 20220329/2022032900/2022032900_0
4th row: 20220329/2022032900/2022032900_0
5th row: 20220329/2022032900/2022032900_0

Common Values

Value  Count  Frequency (%)
2022041901/2022041901_3  12228  0.2%
2022042101/2022042101_3  11934  0.2%
2022042101/2022042101_11  11928  0.2%
2022042101/2022042101_10  11431  0.2%
2022042101/2022042101_1  10756  0.2%
2022041900/2022041900_51  10703  0.2%
20220329/2022032900/2022032900_54  10619  0.2%
2022041901/2022041901_1  10605  0.2%
20220329/2022032900/2022032900_57  10422  0.2%
20220329/2022032900/2022032900_53  10353  0.2%
Other values (1823)  5568618  98.0%

Length

Histogram of lengths of the category

Most occurring characters

Value  Count  Frequency (%)
2  44262028  31.0%
0  33998767  23.8%
1  16527510  11.6%
9  12545795  8.8%
4  11803689  8.3%
/  6489247  4.6%
_  5679597  4.0%
3  5075636  3.6%
5  2218107  1.6%
8  1394214  1.0%
Other values (2)  2611139  1.8%

Most occurring categories

Value  Count  Frequency (%)
Decimal Number  130436885  91.5%
Other Punctuation  6489247  4.6%
Connector Punctuation  5679597  4.0%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
2  44262028  33.9%
0  33998767  26.1%
1  16527510  12.7%
9  12545795  9.6%
4  11803689  9.0%
3  5075636  3.9%
5  2218107  1.7%
8  1394214  1.1%
7  1320953  1.0%
6  1290186  1.0%

Other Punctuation
Value  Count  Frequency (%)
/  6489247  100.0%

Connector Punctuation
Value  Count  Frequency (%)
_  5679597  100.0%

Most occurring scripts

Value  Count  Frequency (%)
Common  142605729  100.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
2  44262028  31.0%
0  33998767  23.8%
1  16527510  11.6%
9  12545795  8.8%
4  11803689  8.3%
/  6489247  4.6%
_  5679597  4.0%
3  5075636  3.6%
5  2218107  1.6%
8  1394214  1.0%
Other values (2)  2611139  1.8%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  142605729  100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
2  44262028  31.0%
0  33998767  23.8%
1  16527510  11.6%
9  12545795  8.8%
4  11803689  8.3%
/  6489247  4.6%
_  5679597  4.0%
3  5075636  3.6%
5  2218107  1.6%
8  1394214  1.0%
Other values (2)  2611139  1.8%

win_count
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 613901
Distinct (%): 10.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 279245.561
Minimum: 1
Maximum: 616221
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 43.3 MiB

Quantile statistics

Minimum: 1
5-th percentile: 26820
Q1: 138614
Median: 277341
Q3: 408122
95-th percentile: 563291
Maximum: 616221
Range: 616220
Interquartile range (IQR): 269508

Descriptive statistics

Standard deviation: 165186.6807
Coefficient of variation (CV): 0.591546308
Kurtosis: -1.015428343
Mean: 279245.561
Median Absolute Deviation (MAD): 134529
Skewness: 0.1209900112
Sum: 1.586002251 × 10^12
Variance: 2.728663947 × 10^10
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values
Value  Count  Frequency (%)
423722  97  < 0.1%
395177  96  < 0.1%
401436  84  < 0.1%
398996  83  < 0.1%
386573  81  < 0.1%
381232  80  < 0.1%
407713  78  < 0.1%
363563  77  < 0.1%
376908  77  < 0.1%
358519  75  < 0.1%
Other values (613891)  5678769  > 99.9%

Minimum 10 values
Value  Count  Frequency (%)
1  11  < 0.1%
2  8  < 0.1%
3  13  < 0.1%
4  15  < 0.1%
5  9  < 0.1%
6  5  < 0.1%
7  17  < 0.1%
8  7  < 0.1%
9  7  < 0.1%
10  12  < 0.1%

Maximum 10 values
Value  Count  Frequency (%)
616221  3  < 0.1%
616220  5  < 0.1%
616219  3  < 0.1%
616218  7  < 0.1%
616217  3  < 0.1%
616216  3  < 0.1%
616215  5  < 0.1%
616214  3  < 0.1%
616213  7  < 0.1%
616212  4  < 0.1%

sha256
Categorical

HIGH CARDINALITY

Distinct: 874949
Distinct (%): 15.4%
Missing: 0
Missing (%): 0.0%
Memory size: 43.3 MiB

Value  Count
4d49234ed4c0ec62648d6885a28b992d1dbe6b3fc14f3166f697832e7a1c206c  969
32b399bd02caa96103c82b0d886c92707e9acc05be76235ce7ba057802c1a35a  867
4f0bdf749061f2e84fe28724f95acef84d9e8fe847710585fc394469bd844238  550
f97aef6102ec86f06c1b143a8bea30a250624b5fe243f97c07fda439cb7a21c4  488
3c1c6d813d2b031d988204155fc198fe4f32ff56c05dabbcfcd5486131f4fb9d  469
Other values (874944)  5676254

Length

Max length: 64
Median length: 64
Mean length: 64
Min length: 64

Characters and Unicode

Total characters: 363494208
Distinct characters: 16
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 7996
Unique (%): 0.1%

Sample

1st row: e0fac1d131a2be9e6332565b4070028212f93728dfb03cc009785b956247847b
2nd row: e0fac1d131a2be9e6332565b4070028212f93728dfb03cc009785b956247847b
3rd row: e0fac1d131a2be9e6332565b4070028212f93728dfb03cc009785b956247847b
4th row: e0fac1d131a2be9e6332565b4070028212f93728dfb03cc009785b956247847b
5th row: e0fac1d131a2be9e6332565b4070028212f93728dfb03cc009785b956247847b

Common Values

Value  Count  Frequency (%)
4d49234ed4c0ec62648d6885a28b992d1dbe6b3fc14f3166f697832e7a1c206c  969  < 0.1%
32b399bd02caa96103c82b0d886c92707e9acc05be76235ce7ba057802c1a35a  867  < 0.1%
4f0bdf749061f2e84fe28724f95acef84d9e8fe847710585fc394469bd844238  550  < 0.1%
f97aef6102ec86f06c1b143a8bea30a250624b5fe243f97c07fda439cb7a21c4  488  < 0.1%
3c1c6d813d2b031d988204155fc198fe4f32ff56c05dabbcfcd5486131f4fb9d  469  < 0.1%
a45c832eed0031c56d51e0ae41a2b1cc597df57ddcf87b89cdab4e9d8ddc9e78  468  < 0.1%
9d5b4887dd3166f6284b3220de0c77136f3dc795f4d7e40711472f9f24b390f4  464  < 0.1%
8f6cc96686e671bc8f2d980f39ffe125517c4ce2407755289e52b19cdaee9961  448  < 0.1%
77980723f53e66234368e2db43fda4e640fcfae134dfdd57c62fb50fd53b2273  432  < 0.1%
2aba55077ca3974c001f02252f014022620487555be964e615078711a6022e36  424  < 0.1%
Other values (874939)  5674018  99.9%

Length

Histogram of lengths of the category

Most occurring characters

Value  Count  Frequency (%)
6  22780998  6.3%
4  22743450  6.3%
c  22734045  6.3%
0  22731443  6.3%
f  22728833  6.3%
5  22724334  6.3%
9  22723138  6.3%
3  22721024  6.3%
1  22720737  6.3%
2  22712754  6.2%
Other values (6)  136173452  37.5%

Most occurring categories

Value  Count  Frequency (%)
Decimal Number  227250468  62.5%
Lowercase Letter  136243740  37.5%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
6  22780998  10.0%
4  22743450  10.0%
0  22731443  10.0%
5  22724334  10.0%
9  22723138  10.0%
3  22721024  10.0%
1  22720737  10.0%
2  22712754  10.0%
7  22712523  10.0%
8  22680067  10.0%

Lowercase Letter
Value  Count  Frequency (%)
c  22734045  16.7%
f  22728833  16.7%
a  22711073  16.7%
e  22709755  16.7%
b  22687751  16.7%
d  22672283  16.6%

Most occurring scripts

Value  Count  Frequency (%)
Common  227250468  62.5%
Latin  136243740  37.5%

Most frequent character per script

Common
Value  Count  Frequency (%)
6  22780998  10.0%
4  22743450  10.0%
0  22731443  10.0%
5  22724334  10.0%
9  22723138  10.0%
3  22721024  10.0%
1  22720737  10.0%
2  22712754  10.0%
7  22712523  10.0%
8  22680067  10.0%

Latin
Value  Count  Frequency (%)
c  22734045  16.7%
f  22728833  16.7%
a  22711073  16.7%
e  22709755  16.7%
b  22687751  16.7%
d  22672283  16.6%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  363494208  100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
6  22780998  6.3%
4  22743450  6.3%
c  22734045  6.3%
0  22731443  6.3%
f  22728833  6.3%
5  22724334  6.3%
9  22723138  6.3%
3  22721024  6.3%
1  22720737  6.3%
2  22712754  6.2%
Other values (6)  136173452  37.5%

imp_hash
Categorical

HIGH CARDINALITY
MISSING

Distinct: 155717
Distinct (%): 3.0%
Missing: 459864
Missing (%): 8.1%
Memory size: 43.3 MiB

Value  Count
25c7ac00c91884fd2923a489ae9dfbca  361037
dae02f32a21e03ce65412f6e56942daa  166563
f34d5f2d4577ed6d9ceec516c1f5a744  129697
73effd46557538d5fa5561eee3ffc59c  107208
431cb9bbc479c64cb0d873043f4de547  106205
Other values (155712)  4349023

Length

Max length: 32
Median length: 32
Mean length: 32
Min length: 32

Characters and Unicode

Total characters: 167031456
Distinct characters: 16
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 81
Unique (%): < 0.1%

Sample

1st row: 73effd46557538d5fa5561eee3ffc59c
2nd row: 73effd46557538d5fa5561eee3ffc59c
3rd row: 73effd46557538d5fa5561eee3ffc59c
4th row: 73effd46557538d5fa5561eee3ffc59c
5th row: 73effd46557538d5fa5561eee3ffc59c

Common Values

Value  Count  Frequency (%)
25c7ac00c91884fd2923a489ae9dfbca  361037  6.4%
dae02f32a21e03ce65412f6e56942daa  166563  2.9%
f34d5f2d4577ed6d9ceec516c1f5a744  129697  2.3%
73effd46557538d5fa5561eee3ffc59c  107208  1.9%
431cb9bbc479c64cb0d873043f4de547  106205  1.9%
8abecba2211e61763c4c9ffcaa13369e  96484  1.7%
359d89624a26d1e756c3e9d6782d6eb0  81309  1.4%
835a0f00bf1f2c5420f77cabc26e254c  66792  1.2%
ff0dfa05658a149b7b21130a1a8daedb  60810  1.1%
9dc46f318397655dea2844d0fd08e2ab  50904  0.9%
Other values (155707)  3992724  70.3%
(Missing)  459864  8.1%

Length

Histogram of lengths of the category

Value  Count  Frequency (%)
25c7ac00c91884fd2923a489ae9dfbca  361037  6.9%
dae02f32a21e03ce65412f6e56942daa  166563  3.2%
f34d5f2d4577ed6d9ceec516c1f5a744  129697  2.5%
73effd46557538d5fa5561eee3ffc59c  107208  2.1%
431cb9bbc479c64cb0d873043f4de547  106205  2.0%
8abecba2211e61763c4c9ffcaa13369e  96484  1.8%
359d89624a26d1e756c3e9d6782d6eb0  81309  1.6%
835a0f00bf1f2c5420f77cabc26e254c  66792  1.3%
ff0dfa05658a149b7b21130a1a8daedb  60810  1.2%
9dc46f318397655dea2844d0fd08e2ab  50904  1.0%
Other values (155707)  3992724  76.5%

Most occurring characters

Value  Count  Frequency (%)
c  11662116  7.0%
2  11178374  6.7%
5  11058686  6.6%
a  10979699  6.6%
9  10965193  6.6%
4  10888850  6.5%
e  10883284  6.5%
f  10654442  6.4%
d  10636423  6.4%
6  10238286  6.1%
Other values (6)  57886103  34.7%

Most occurring categories

Value  Count  Frequency (%)
Decimal Number  103083010  61.7%
Lowercase Letter  63948446  38.3%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
2  11178374  10.8%
5  11058686  10.7%
9  10965193  10.6%
4  10888850  10.6%
6  10238286  9.9%
3  10051469  9.8%
8  9931185  9.6%
0  9737381  9.4%
7  9632810  9.3%
1  9400776  9.1%

Lowercase Letter
Value  Count  Frequency (%)
c  11662116  18.2%
a  10979699  17.2%
e  10883284  17.0%
f  10654442  16.7%
d  10636423  16.6%
b  9132482  14.3%

Most occurring scripts

Value  Count  Frequency (%)
Common  103083010  61.7%
Latin  63948446  38.3%

Most frequent character per script

Common
Value  Count  Frequency (%)
2  11178374  10.8%
5  11058686  10.7%
9  10965193  10.6%
4  10888850  10.6%
6  10238286  9.9%
3  10051469  9.8%
8  9931185  9.6%
0  9737381  9.4%
7  9632810  9.3%
1  9400776  9.1%

Latin
Value  Count  Frequency (%)
c  11662116  18.2%
a  10979699  17.2%
e  10883284  17.0%
f  10654442  16.7%
d  10636423  16.6%
b  9132482  14.3%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  167031456  100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
c  11662116  7.0%
2  11178374  6.7%
5  11058686  6.6%
a  10979699  6.6%
9  10965193  6.6%
4  10888850  6.5%
e  10883284  6.5%
f  10654442  6.4%
d  10636423  6.4%
6  10238286  6.1%
Other values (6)  57886103  34.7%

sec_chi2
Real number (ℝ)

HIGH CORRELATION
SKEWED

Distinct: 1379237
Distinct (%): 24.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4307899.078
Minimum: -1
Maximum: 7.531272602 × 10^10
Zeros: 1
Zeros (%): < 0.1%
Negative: 618253
Negative (%): 10.9%
Memory size: 43.3 MiB

Quantile statistics

Minimum: -1
5-th percentile: -1
Q1: 62994
Median: 194133.02
Q3: 1044480
95-th percentile: 8974431.6
Maximum: 7.531272602 × 10^10
Range: 7.531272602 × 10^10
Interquartile range (IQR): 981486

Descriptive statistics

Standard deviation: 95508577.12
Coefficient of variation (CV): 22.17056978
Kurtosis: 197031.331
Mean: 4307899.078
Median Absolute Deviation (MAD): 194134.02
Skewness: 331.9046384
Sum: 2.446713068 × 10^13
Variance: 9.121888303 × 10^15
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values
Value  Count  Frequency (%)
-1  618253  10.9%
1044480  249888  4.4%
2088960  95356  1.7%
128522  44857  0.8%
128015  41533  0.7%
130560  37687  0.7%
130049  36850  0.6%
125001  36085  0.6%
456641.38  24145  0.4%
511167  24014  0.4%
Other values (1379227)  4470929  78.7%

Minimum 10 values
Value  Count  Frequency (%)
-1  618253  10.9%
0  1  < 0.1%
0.38  2  < 0.1%
1  7  < 0.1%
2  7  < 0.1%
3  7  < 0.1%
5  7  < 0.1%
9  3  < 0.1%
11  2  < 0.1%
12  1  < 0.1%

Maximum 10 values
Value  Count  Frequency (%)
7.531272602 × 10^10  2  < 0.1%
6.134842163 × 10^10  1  < 0.1%
3.617517568 × 10^10  2  < 0.1%
3.28836608 × 10^10  2  < 0.1%
3.231118746 × 10^10  1  < 0.1%
3.229539942 × 10^10  1  < 0.1%
3.219121357 × 10^10  1  < 0.1%
3.128222925 × 10^10  1  < 0.1%
3.106932122 × 10^10  1  < 0.1%
3.097848218 × 10^10  1  < 0.1%
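The report does not define sec_chi2. One reading that is consistent with the common values is a Pearson chi-square statistic of a section's byte histogram against a uniform distribution over the 256 byte values. The sketch below implements that reading; the function name and the -1 sentinel for empty input are assumptions made for illustration.

```python
from collections import Counter


def byte_chi2(data: bytes) -> float:
    """Chi-square of the byte histogram against a uniform expectation.

    Returns -1.0 for empty input (assumed sentinel, mirroring the -1
    values seen in the report).
    """
    if not data:
        return -1.0
    expected = len(data) / 256          # expected count per byte value
    counts = Counter(data)              # observed count per byte value
    return sum(
        (counts.get(value, 0) - expected) ** 2 / expected
        for value in range(256)
    )


print(byte_chi2(bytes(4096)))   # 4096 identical bytes
print(byte_chi2(bytes(8192)))   # 8192 identical bytes
print(byte_chi2(bytes(512)))    # 512 identical bytes
```

Under this reading, a section made of 4096, 8192, or 512 identical bytes yields 1044480, 2088960, and 130560 respectively, which are three of the most frequent values in the table. The 618253 rows with -1 match the number of rows whose sec_md5 is the hash of empty input, which suggests -1 marks sections with no raw data.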

sec_entropy
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct: 801
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.63292423
Minimum: 0
Maximum: 8
Zeros: 1058492
Zeros (%): 18.6%
Negative: 0
Negative (%): 0.0%
Memory size: 43.3 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0.43
Median: 4.17
Q3: 5.84
95-th percentile: 7.64
Maximum: 8
Range: 8
Interquartile range (IQR): 5.41

Descriptive statistics

Standard deviation: 2.612051458
Coefficient of variation (CV): 0.7189942022
Kurtosis: -1.325387784
Mean: 3.63292423
Median Absolute Deviation (MAD): 2.15
Skewness: -0.1728711534
Sum: 20633545.56
Variance: 6.822812822
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values
Value  Count  Frequency (%)
0  1058492  18.6%
0.2  58214  1.0%
8  50329  0.9%
0.08  49066  0.9%
0.1  48416  0.9%
0.02  40036  0.7%
2.77  29342  0.5%
7.99  29206  0.5%
4.33  28762  0.5%
6.62  27952  0.5%
Other values (791)  4259782  75.0%

Minimum 10 values
Value  Count  Frequency (%)
0  1058492  18.6%
0.01  6275  0.1%
0.02  40036  0.7%
0.03  2288  < 0.1%
0.04  2315  < 0.1%
0.05  2561  < 0.1%
0.06  12113  0.2%
0.07  2653  < 0.1%
0.08  49066  0.9%
0.09  1505  < 0.1%

Maximum 10 values
Value  Count  Frequency (%)
8  50329  0.9%
7.99  29206  0.5%
7.98  14388  0.3%
7.97  15702  0.3%
7.96  11799  0.2%
7.95  13642  0.2%
7.94  7825  0.1%
7.93  9854  0.2%
7.92  6768  0.1%
7.91  16574  0.3%
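The range of sec_entropy (0 to 8, with 801 distinct values) is what one would expect from Shannon entropy in bits per byte, rounded to two decimals. The report does not say how the column was computed, so the following is only a sketch of that interpretation with a hypothetical function name.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy of a byte string in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0  # assumption: empty sections are scored as zero
    n = len(data)
    # Sum p * log2(p) over the byte values that actually occur.
    return -sum(
        (count / n) * math.log2(count / n)
        for count in Counter(data).values()
    )


print(shannon_entropy(bytes(4096)))        # a single repeated byte value
print(shannon_entropy(bytes(range(256))))  # every byte value equally often
```

Values near 8 typically indicate compressed or encrypted content, while the large share of zeros (18.6%) is consistent with empty or constant-filled sections.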

sec_md5
Categorical

HIGH CARDINALITY

Distinct: 1654205
Distinct (%): 29.1%
Missing: 0
Missing (%): 0.0%
Memory size: 43.3 MiB

Value  Count
d41d8cd98f00b204e9800998ecf8427e  618253
620f0b67a91f7f74151bc5be745b7110  249822
0829f71740aab1ab98b33eae21dee122  95349
bf619eac0cdf3f68d496ea9344137e8b  37300
1f354d76203061bfdd5a53dae48d5435  31458
Other values (1654200)  4647415

Length

Max length: 32
Median length: 32
Mean length: 32
Min length: 32

Characters and Unicode

Total characters: 181747104
Distinct characters: 16
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1310775
Unique (%): 23.1%

Sample

1st row: a66b3d283f2407d9f7b049b85ce9e049
2nd row: 0e2b2499a27ab784a2bae9a1dc945fde
3rd row: 89ce79b3a1e62aeb2ea80a5018651a44
4th row: 7e016ed8299b52ab729134bb6f3806a0
5th row: ff258a9ce39ba902c63a60b25edea110

Common Values

Value  Count  Frequency (%)
d41d8cd98f00b204e9800998ecf8427e  618253  10.9%
620f0b67a91f7f74151bc5be745b7110  249822  4.4%
0829f71740aab1ab98b33eae21dee122  95349  1.7%
bf619eac0cdf3f68d496ea9344137e8b  37300  0.7%
1f354d76203061bfdd5a53dae48d5435  31458  0.6%
026e87d25a05c2499d22d04b55efd3dd  24145  0.4%
17ba5940940f6b003c9ab874b48d12d4  23995  0.4%
af19baff048a7ca28bd9b67dd9c2fd1c  19710  0.3%
89ce79b3a1e62aeb2ea80a5018651a44  17367  0.3%
7e016ed8299b52ab729134bb6f3806a0  15830  0.3%
Other values (1654195)  4546368  80.0%
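The most frequent sec_md5 value, d41d8cd98f00b204e9800998ecf8427e, is the MD5 digest of zero bytes of input, so those 618253 rows describe sections with no raw content. A short check with the standard library follows; the helper name is hypothetical.

```python
import hashlib

# MD5 of empty input, computed rather than hard-coded.
EMPTY_MD5 = hashlib.md5(b"").hexdigest()


def is_empty_section_hash(sec_md5: str) -> bool:
    """Return True when a section hash equals the MD5 of empty input."""
    return sec_md5.lower() == EMPTY_MD5


print(EMPTY_MD5)
print(is_empty_section_hash("d41d8cd98f00b204e9800998ecf8427e"))
```

A flag like this could be used to separate empty sections before modelling, since they account for 10.9% of the rows and carry no content-based signal.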

Length

Histogram of lengths of the category

Most occurring characters

Value  Count  Frequency (%)
0  13437759  7.4%
8  12776146  7.0%
9  12444981  6.8%
4  12055314  6.6%
e  12017247  6.6%
f  11575587  6.4%
1  11528200  6.3%
7  11422870  6.3%
d  11415579  6.3%
b  11368496  6.3%
Other values (6)  61704925  34.0%

Most occurring categories

Value  Count  Frequency (%)
Decimal Number  114269262  62.9%
Lowercase Letter  67477842  37.1%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
0  13437759  11.8%
8  12776146  11.2%
9  12444981  10.9%
4  12055314  10.5%
1  11528200  10.1%
7  11422870  10.0%
2  11116734  9.7%
5  10018652  8.8%
6  9868239  8.6%
3  9600367  8.4%

Lowercase Letter
Value  Count  Frequency (%)
e  12017247  17.8%
f  11575587  17.2%
d  11415579  16.9%
b  11368496  16.8%
c  10821744  16.0%
a  10279189  15.2%

Most occurring scripts

Value  Count  Frequency (%)
Common  114269262  62.9%
Latin  67477842  37.1%

Most frequent character per script

Common
Value  Count  Frequency (%)
0  13437759  11.8%
8  12776146  11.2%
9  12444981  10.9%
4  12055314  10.5%
1  11528200  10.1%
7  11422870  10.0%
2  11116734  9.7%
5  10018652  8.8%
6  9868239  8.6%
3  9600367  8.4%

Latin
Value  Count  Frequency (%)
e  12017247  17.8%
f  11575587  17.2%
d  11415579  16.9%
b  11368496  16.8%
c  10821744  16.0%
a  10279189  15.2%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  181747104  100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
0  13437759  7.4%
8  12776146  7.0%
9  12444981  6.8%
4  12055314  6.6%
e  12017247  6.6%
f  11575587  6.4%
1  11528200  6.3%
7  11422870  6.3%
d  11415579  6.3%
b  11368496  6.3%
Other values (6)  61704925  34.0%

raw_size
Real number (ℝ≥0)

HIGH CORRELATION
SKEWED
ZEROS

Distinct: 32179
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 289463.676
Minimum: 0
Maximum: 4294936064
Zeros: 502261
Zeros (%): 8.8%
Negative: 0
Negative (%): 0.0%
Memory size: 43.3 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 1024
Median: 4608
Q3: 46080
95-th percentile: 801280
Maximum: 4294936064
Range: 4294936064
Interquartile range (IQR): 45056

Descriptive statistics

Standard deviation: 4918440.912
Coefficient of variation (CV): 16.99156516
Kurtosis: 226659.5363
Mean: 289463.676
Median Absolute Deviation (MAD): 4608
Skewness: 325.3469567
Sum: 1.644037026 × 10^12
Variance: 2.419106101 × 10^13
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values
Value  Count  Frequency (%)
512  777672  13.7%
4096  634499  11.2%
0  502261  8.8%
1024  241995  4.3%
1536  184775  3.3%
8192  183285  3.2%
2048  149758  2.6%
2560  111512  2.0%
12288  108971  1.9%
4608  78231  1.4%
Other values (32169)  2706638  47.7%

Minimum 10 values
Value  Count  Frequency (%)
0  502261  8.8%
1  1  < 0.1%
3  2  < 0.1%
4  20  < 0.1%
5  14  < 0.1%
6  4  < 0.1%
7  5  < 0.1%
8  41  < 0.1%
9  41  < 0.1%
10  22  < 0.1%

Maximum 10 values
Value  Count  Frequency (%)
4294936064  1  < 0.1%
4278267904  1  < 0.1%
2425393296  1  < 0.1%
1702112768  1  < 0.1%
1160249886  11  < 0.1%
1073750016  1  < 0.1%
995172352  1  < 0.1%
988295168  2  < 0.1%
893046784  1  < 0.1%
758795776  1  < 0.1%

virtual_size
Real number (ℝ≥0)

HIGH CORRELATION
SKEWED

Distinct: 317607
Distinct (%): 5.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 371258.9776
Minimum: 0
Maximum: 2147573268
Zeros: 979
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 43.3 MiB

Quantile statistics

Minimum: 0
5-th percentile: 24
Q1: 1568
Median: 8192
Q3: 63947
95-th percentile: 1027607.2
Maximum: 2147573268
Range: 2147573268
Interquartile range (IQR): 62379

Descriptive statistics

Standard deviation: 5624119.463
Coefficient of variation (CV): 15.148777
Kurtosis: 19906.40757
Mean: 371258.9776
Median Absolute Deviation (MAD): 8092
Skewness: 104.1186351
Sum: 2.108601376 × 10^12
Variance: 3.163071973 × 10^13
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values
Value  Count  Frequency (%)
4096  165688  2.9%
12  120785  2.1%
24  85747  1.5%
8  63996  1.1%
1540  45298  0.8%
8192  43838  0.8%
9328  37938  0.7%
8340  35374  0.6%
55504  35260  0.6%
180274  35243  0.6%
Other values (317597)  5010430  88.2%

Minimum 10 values
Value  Count  Frequency (%)
0  979  < 0.1%
1  2080  < 0.1%
2  8700  0.2%
3  874  < 0.1%
4  14686  0.3%
5  493  < 0.1%
6  40  < 0.1%
7  65  < 0.1%
8  63996  1.1%
9  32583  0.6%

Maximum 10 values
Value  Count  Frequency (%)
2147573268  1  < 0.1%
1951047900  1  < 0.1%
1744838340  1  < 0.1%
1711271132  2  < 0.1%
1399767480  1  < 0.1%
1387934712  1  < 0.1%
1308630724  1  < 0.1%
1073783168  1  < 0.1%
1050673152  1  < 0.1%
1001091072  3  < 0.1%

virtual_address
Real number (ℝ≥0)

HIGH CORRELATION
SKEWED

Distinct: 24645
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1345613.729
Minimum: 0
Maximum: 2425364480
Zeros: 70
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 43.3 MiB

Quantile statistics

Minimum: 0
5-th percentile: 4096
Q1: 24576
Median: 126976
Q3: 692224
95-th percentile: 3534848
Maximum: 2425364480
Range: 2425364480
Interquartile range (IQR): 667648

Descriptive statistics

Standard deviation: 10317474.8
Coefficient of variation (CV): 7.667486276
Kurtosis: 3992.71949
Mean: 1345613.729
Median Absolute Deviation (MAD): 122880
Skewness: 43.51514194
Sum: 7.642543699 × 10^12
Variance: 1.064502863 × 10^14
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values
Value  Count  Frequency (%)
4096  932825  16.4%
8192  169335  3.0%
20480  116469  2.1%
16384  111888  2.0%
24576  104188  1.8%
32768  103358  1.8%
36864  100309  1.8%
49152  93129  1.6%
40960  89240  1.6%
12288  75319  1.3%
Other values (24635)  3783537  66.6%

Minimum 10 values
Value  Count  Frequency (%)
0  70  < 0.1%
2  1  < 0.1%
3  1  < 0.1%
16  2  < 0.1%
288  2  < 0.1%
320  1  < 0.1%
392  41  < 0.1%
416  3  < 0.1%
448  30  < 0.1%
480  14  < 0.1%

Maximum 10 values
Value  Count  Frequency (%)
2425364480  1  < 0.1%
1954099200  1  < 0.1%
1953947648  1  < 0.1%
1953878016  1  < 0.1%
1745260544  1  < 0.1%
1745256448  1  < 0.1%
1712082944  2  < 0.1%
1712037888  2  < 0.1%
1400750080  1  < 0.1%
1388793856  1  < 0.1%

sec_name
Categorical

HIGH CARDINALITY

Distinct: 37166
Distinct (%): 0.7%
Missing: 30833
Missing (%): 0.5%
Memory size: 43.3 MiB

Value  Count
.rsrc  935808
.text  815013
.data  690304
.rdata  653926
.reloc  578166
Other values (37161)  1975547

Length

Max length: 8
Median length: 7
Mean length: 5.277376608
Min length: 1

Characters and Unicode

Total characters: 29810655
Distinct characters: 94
Distinct categories: 11
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 20846
Unique (%): 0.4%

Sample

1st row: .text
2nd row: .rdata
3rd row: .data
4th row: .pdata
5th row: .rsrc

Common Values

Value  Count  Frequency (%)
.rsrc  935808  16.5%
.text  815013  14.3%
.data  690304  12.2%
.rdata  653926  11.5%
.reloc  578166  10.2%
.idata  175808  3.1%
.pdata  159084  2.8%
.tls  158860  2.8%
.bss  80205  1.4%
UPX0  77589  1.4%
Other values (37156)  1324001  23.3%

Length

Histogram of lengths of the category

Value  Count  Frequency (%)
rsrc  935867  16.6%
text  815991  14.4%
data  752676  13.3%
rdata  671868  11.9%
reloc  578189  10.2%
idata  176128  3.1%
pdata  159140  2.8%
tls  158872  2.8%
bss  141409  2.5%
upx0  77913  1.4%
Other values (36179)  1180711  20.9%

Most occurring characters

ValueCountFrequency (%)
.5067513
17.0%
t3806840
12.8%
a3727779
12.5%
r3254926
10.9%
d1987693
 
6.7%
c1667603
 
5.6%
e1584411
 
5.3%
s1429001
 
4.8%
x928079
 
3.1%
l909000
 
3.0%
Other values (84)5447810
18.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter22178616
74.4%
Other Punctuation5155210
 
17.3%
Uppercase Letter1956479
 
6.6%
Decimal Number480505
 
1.6%
Connector Punctuation34403
 
0.1%
Modifier Symbol2003
 
< 0.1%
Dash Punctuation1082
 
< 0.1%
Math Symbol1027
 
< 0.1%
Close Punctuation587
 
< 0.1%
Open Punctuation505
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
t3806840
17.2%
a3727779
16.8%
r3254926
14.7%
d1987693
9.0%
c1667603
7.5%
e1584411
7.1%
s1429001
 
6.4%
x928079
 
4.2%
l909000
 
4.1%
o727111
 
3.3%
Other values (16)2156173
9.7%
Uppercase Letter
ValueCountFrequency (%)
P238203
12.2%
U211432
10.8%
S208766
10.7%
X199173
10.2%
A183458
9.4%
D163530
8.4%
T129967
 
6.6%
E114633
 
5.9%
C97036
 
5.0%
B78781
 
4.0%
Other values (16)331500
16.9%
Other Punctuation
ValueCountFrequency (%)
.5067513
98.3%
/65509
 
1.3%
#13891
 
0.3%
\4517
 
0.1%
:784
 
< 0.1%
@727
 
< 0.1%
?504
 
< 0.1%
!337
 
< 0.1%
*314
 
< 0.1%
%220
 
< 0.1%
Other values (5)894
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
0177090
36.9%
1165112
34.4%
236979
 
7.7%
428309
 
5.9%
715264
 
3.2%
914218
 
3.0%
513647
 
2.8%
311063
 
2.3%
89971
 
2.1%
68852
 
1.8%
Math Symbol
ValueCountFrequency (%)
=267
26.0%
+206
20.1%
>187
18.2%
<155
15.1%
|124
12.1%
~88
 
8.6%
Close Punctuation
ValueCountFrequency (%)
]285
48.6%
}154
26.2%
)148
25.2%
Open Punctuation
ValueCountFrequency (%)
{233
46.1%
[137
27.1%
(135
26.7%
Modifier Symbol
ValueCountFrequency (%)
^1840
91.9%
`163
 
8.1%
Connector Punctuation
ValueCountFrequency (%)
_34403
100.0%
Dash Punctuation
ValueCountFrequency (%)
-1082
100.0%
Currency Symbol
ValueCountFrequency (%)
$238
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin24135095
81.0%
Common5675560
 
19.0%

Most frequent character per script

Latin
Value                 Count      Frequency (%)
t                     3806840    15.8%
a                     3727779    15.4%
r                     3254926    13.5%
d                     1987693    8.2%
c                     1667603    6.9%
e                     1584411    6.6%
s                     1429001    5.9%
x                     928079     3.8%
l                     909000     3.8%
o                     727111     3.0%
Other values (42)     4112652    17.0%

Common
Value                 Count      Frequency (%)
.                     5067513    89.3%
0                     177090     3.1%
1                     165112     2.9%
/                     65509      1.2%
2                     36979      0.7%
_                     34403      0.6%
4                     28309      0.5%
7                     15264      0.3%
9                     14218      0.3%
#                     13891      0.2%
Other values (32)     57272      1.0%

Most occurring blocks

Value                 Count      Frequency (%)
ASCII                 29810655   100.0%

Most frequent character per block

ASCII
Value                 Count      Frequency (%)
.                     5067513    17.0%
t                     3806840    12.8%
a                     3727779    12.5%
r                     3254926    10.9%
d                     1987693    6.7%
c                     1667603    5.6%
e                     1584411    5.3%
s                     1429001    4.8%
x                     928079     3.1%
l                     909000     3.0%
Other values (84)     5447810    18.3%

Interactions

[Figures: 49 interaction plots (Matplotlib v3.5.2, generated 2022-08-01); images not reproduced here.]

Correlations


Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better than Pearson's r at catching nonlinear monotonic correlations. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of the standard deviations of those rank variables.
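
The calculation described above can be sketched in plain Python: rank both variables, then divide the covariance of the ranks by the product of their standard deviations. This is a minimal illustration, not the code used to produce this report; the function names `average_ranks` and `spearman_rho` and the sample data are assumptions, and tied values are assumed to share the mean of their rank positions.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of equal values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Covariance of the ranks divided by the product of their std. deviations."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry)) / n
    std_x = sqrt(sum((a - mean_x) ** 2 for a in rx) / n)
    std_y = sqrt(sum((b - mean_y) ** 2 for b in ry) / n)
    return cov / (std_x * std_y)


# Hypothetical data: y grows monotonically but not linearly with x.
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]
rho = spearman_rho(x, y)
```

Because the two rank sequences are identical for any strictly increasing relationship, `rho` here equals 1 up to floating-point rounding, even though the relationship is not linear.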

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, which implies that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
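
As a rough illustration of this formula, the sketch below computes r directly from the covariance and the standard deviations, and then checks the invariance property on a shifted and rescaled copy of X. The function name `pearson_r` and the sample values are assumptions made for the example, and a positive scale factor is assumed (a negative factor would flip the sign of r).

```python
from math import sqrt


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their std. deviations."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    std_x = sqrt(sum((a - mean_x) ** 2 for a in x) / n)
    std_y = sqrt(sum((b - mean_y) ** 2 for b in y) / n)
    return cov / (std_x * std_y)


# Hypothetical sample values.
x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 4.0, 5.0, 9.0]

r = pearson_r(x, y)

# Changing the location (+10) and scale (*3) of x leaves r unchanged.
r_shifted = pearson_r([3.0 * a + 10.0 for a in x], y)
```

Comparing `r` and `r_shifted` shows the two values agree up to floating-point rounding.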

Kendall's τ

Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
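
The pair-counting definition above corresponds to the variant usually called τ-a. The sketch below implements exactly that definition; it is an assumption that this matches the description, and statistical libraries often report a tie-corrected variant (τ-b) that can differ when the data contain ties. The function name `kendall_tau_a` and the sample data are illustrative.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1   # both variables move in the same direction
        elif direction < 0:
            discordant += 1   # the variables move in opposite directions
        # direction == 0 means a tie; it counts toward neither group
    n = len(x)
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs


# Hypothetical data: one swapped pair out of six.
tau = kendall_tau_a([1, 2, 3, 4], [1, 3, 2, 4])
```

With four observations there are six pairs; five are concordant and one is discordant, so `tau` is (5 - 1) / 6.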

Phik (φk)

Phik (φk) is a practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion by eye.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
The dendrogram allows you to more fully correlate variable completion, revealing trends deeper than the pairwise ones visible in the correlation heatmap.
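
To make the idea of nullity correlation concrete, here is a minimal sketch that treats each column as a 0/1 presence indicator and computes the Pearson correlation between two such indicators. It is an illustration under stated assumptions rather than the code behind the heatmap: the function name `nullity_correlation`, the use of `None` as the missing marker, and the toy columns are all hypothetical.

```python
from math import sqrt


def nullity_correlation(col_a, col_b):
    """Pearson correlation between the presence indicators of two columns."""
    a = [0 if v is None else 1 for v in col_a]
    b = [0 if v is None else 1 for v in col_b]
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    if var_a == 0 or var_b == 0:
        # A column that is always present (or always missing) has no
        # variation, so its nullity correlation is undefined.
        return None
    return cov / sqrt(var_a * var_b)


# Hypothetical toy columns (None marks a missing value).
first = ["a", None, "b", None, "c"]
second = ["x", None, "y", None, "z"]    # missing exactly where `first` is
third = [None, 1, None, 2, None]        # present exactly where `first` is missing
complete = [1, 2, 3, 4, 5]              # never missing

same = nullity_correlation(first, second)
opposite = nullity_correlation(first, third)
undefined = nullity_correlation(first, complete)
```

Here `same` is close to +1 (the columns are missing together), `opposite` is close to -1 (one is present exactly when the other is missing), and `undefined` is `None` because a fully populated column carries no nullity information.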

Sample

First rows

Unnamed: 0  filename  win_count  sha256  imp_hash  sec_chi2  sec_entropy  sec_md5  raw_size  virtual_size  virtual_address  sec_name
0020220329/2022032900/2022032900_01e0fac1d131a2be9e6332565b4070028212f93728dfb03cc009785b956247847b73effd46557538d5fa5561eee3ffc59c1267884.256.41a66b3d283f2407d9f7b049b85ce9e0491843201802744096.text
1120220329/2022032900/2022032900_01e0fac1d131a2be9e6332565b4070028212f93728dfb03cc009785b956247847b73effd46557538d5fa5561eee3ffc59c2956925.754.960e2b2499a27ab784a2bae9a1dc945fde5734455504188416.rdata
2220220329/2022032900/2022032900_01e0fac1d131a2be9e6332565b4070028212f93728dfb03cc009785b956247847b73effd46557538d5fa5561eee3ffc59c512492.882.6089ce79b3a1e62aeb2ea80a5018651a4440969328245760.data
3320220329/2022032900/2022032900_01e0fac1d131a2be9e6332565b4070028212f93728dfb03cc009785b956247847b73effd46557538d5fa5561eee3ffc59c917905.194.067e016ed8299b52ab729134bb6f3806a0122888340258048.pdata
4420220329/2022032900/2022032900_01e0fac1d131a2be9e6332565b4070028212f93728dfb03cc009785b956247847b73effd46557538d5fa5561eee3ffc59c1889.828.00ff258a9ce39ba902c63a60b25edea11031989763198008270336.rsrc
5520220329/2022032900/2022032900_01e0fac1d131a2be9e6332565b4070028212f93728dfb03cc009785b956247847b73effd46557538d5fa5561eee3ffc59c423530.882.89a60e9d10afeb59cf6c54164699e1f074409615403469312.reloc
6620220329/2022032900/2022032900_02b804db8237850c862e428b11f2551709c167257a2676d94fde6624ec60d24a6e9fd5e356cbddf729790916e149772f9d1477174.636.252522b05609ecbf1e7e20bdcff30fbfb62432002428114096.text
7720220329/2022032900/2022032900_02b804db8237850c862e428b11f2551709c167257a2676d94fde6624ec60d24a6e9fd5e356cbddf729790916e149772f9d1864716.385.69be5415110d0b3cb3dd956b370247bf3a123392123135249856.rdata
8820220329/2022032900/2022032900_02b804db8237850c862e428b11f2551709c167257a2676d94fde6624ec60d24a6e9fd5e356cbddf729790916e149772f9d78926.064.92454b4efd187e79002f8c69e43cf3611f51206928376832.data
9920220329/2022032900/2022032900_02b804db8237850c862e428b11f2551709c167257a2676d94fde6624ec60d24a6e9fd5e356cbddf729790916e149772f9d56312.004.8124642b0d9ff9c297eea3fc60729bfd5b20481716385024.rsrc

Last rows

Unnamed: 0  filename  win_count  sha256  imp_hash  sec_chi2  sec_entropy  sec_md5  raw_size  virtual_size  virtual_address  sec_name
567958756795872022042101/2022042101_4661621909d1535b874f69d30a7edf5fe65715135ae901b4844a7240490c1757f3ffaa3df34d5f2d4577ed6d9ceec516c1f5a74469742.164.267bd915f69bd56aaeeb6863a5cf69504915361536540672.rsrc
567958856795882022042101/2022042101_4661621909d1535b874f69d30a7edf5fe65715135ae901b4844a7240490c1757f3ffaa3df34d5f2d4577ed6d9ceec516c1f5a744128015.000.102ca4b30e551869adf62495beaa322e0751212548864.reloc
567958956795892022042101/2022042101_466162200fa088538c604a23290d3f9376ae009ff2ac078559a3d3d23aab2f4db5ee0ea8f3649b7c05f406bc34ad7e13cbbcc080825516.696.574d45dd25bdeaf30ed5ec2a3dd84c67e01448961446104096.text
567959056795902022042101/2022042101_466162200fa088538c604a23290d3f9376ae009ff2ac078559a3d3d23aab2f4db5ee0ea8f3649b7c05f406bc34ad7e13cbbcc0801574600.004.10e9b9339d80262fb5e3f54fcad20d21f32252822128151552.rdata
567959156795912022042101/2022042101_466162200fa088538c604a23290d3f9376ae009ff2ac078559a3d3d23aab2f4db5ee0ea8f3649b7c05f406bc34ad7e13cbbcc0802279848.002.979598147109ff177a2f0f971f3bd42d651843234984176128.data
567959256795922022042101/2022042101_466162200fa088538c604a23290d3f9376ae009ff2ac078559a3d3d23aab2f4db5ee0ea8f3649b7c05f406bc34ad7e13cbbcc080128681.315.41c8451f9b31623084c6780c0b96d8564a81927870212992.idata
567959356795932022042101/2022042101_466162200fa088538c604a23290d3f9376ae009ff2ac078559a3d3d23aab2f4db5ee0ea8f3649b7c05f406bc34ad7e13cbbcc080706308.944.753277b11fd8259e2e2d59ed356cbfbb071484814588221184.rsrc
567959456795942022042101/2022042101_46616221ea1dbbbe26fd7276d651874dc5d1c8c6475eaa17ce7da2291e4d77ca29a1e1ad25ede10343b691ef6250a90568027455484577.726.541e133243f2ace88d8ba0eac8c1b0d0d887552872444096.text
567959556795952022042101/2022042101_46616221ea1dbbbe26fd7276d651874dc5d1c8c6475eaa17ce7da2291e4d77ca29a1e1ad25ede10343b691ef6250a905680274551242381.256.066e3c9602f6c222bd67ec945c513d2d1039424039423894208.rdata
567959656795962022042101/2022042101_46616221ea1dbbbe26fd7276d651874dc5d1c8c6475eaa17ce7da2291e4d77ca29a1e1ad25ede10343b691ef6250a90568027455554587.693.622998d9d31db5f16bba41eeccbcc4abe2563213472491520.data